Curse of dimensionality
https://gyazo.com/a584556b87390d4fbad2747e235a0017
Humans have difficulty imagining beyond 2-4 dimensions.
Many unexpected phenomena occur as the number of dimensions grows.
In one dimension, the points within distance 1 of the origin make up half of the points within distance 2.
1/4 in 2 dimensions
1/8 in 3 dimensions
...so the fraction of "close" points decreases exponentially as the dimension increases.
Consequently, the number of samples needed to cover the space at a given density increases exponentially.
In machine learning, adding feature dimensions can make accuracy worse.
Because the harm from the now-insufficient sample density outweighs the information gained from the extra dimension (see the sketch below).
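A minimal sketch of this volume argument (the function name and sample counts are mine, not from the page): for points drawn uniformly from the ball of radius 2, the fraction landing within radius 1 is (1/2)^d, matching the list above.

```python
import numpy as np

rng = np.random.default_rng(0)

def fraction_within_1(d, n=1_000_000):
    # For points uniform in the d-ball of radius 2, only the radius matters,
    # and it is distributed as 2 * U**(1/d) for U uniform on [0, 1].
    radii = 2 * rng.random(n) ** (1 / d)
    return np.mean(radii <= 1.0)

for d in (1, 2, 3, 10):
    print(d, fraction_within_1(d), "expected:", 0.5 ** d)
```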
https://gyazo.com/934a40866acc18c6b266fdbb0c8b1ac2
In 3 or more dimensions, the mode of a vector's length is non-zero.
Assuming each coordinate follows a standard normal distribution, whose mode is 0.
This is related to the fact that "most points are far from the center."
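A minimal sketch (my own check; the sample size and bin count are arbitrary): the length of a vector with i.i.d. standard-normal coordinates follows a chi distribution with d degrees of freedom, whose mode is sqrt(d-1), so the histogram's peak moves away from zero as d grows.

```python
import numpy as np

rng = np.random.default_rng(0)

for d in (1, 3, 10, 100):
    # Lengths of 100,000 standard-normal vectors in d dimensions.
    lengths = np.linalg.norm(rng.standard_normal((100_000, d)), axis=1)
    counts, edges = np.histogram(lengths, bins=200)
    peak = edges[np.argmax(counts)]  # left edge of the tallest histogram bin
    print(f"d={d:3d}  empirical mode ~ {peak:.2f}  sqrt(d-1) = {(d - 1) ** 0.5:.2f}")
```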
Almost all pairs of vectors are nearly orthogonal
Sampling 1,000,000 pairs of random vectors and measuring the fraction with cosine similarity greater than 1/2:
0.06 (about 1/17) in 10 dimensions,
0.01 (about 1/100) in 20 dimensions,
0.0021 (about 1/480) in 30 dimensions,
0.00042 (about 1/2400) in 40 dimensions
In 100 dimensions, none of 10,000,000 sampled pairs had a cosine similarity greater than 1/2.
For reference, in two dimensions it is 33% (the angle between the vectors must be within ±60°, i.e. 120° out of 360°).
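A minimal sketch reconstructing this experiment (the default sample size here is smaller than the page's 1,000,000 to keep memory modest; results are estimates):

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_cos_above_half(d, n=100_000):
    # By rotational symmetry, fix one vector along the first axis and
    # sample only the other; the cosine is then b[0] / ||b||.
    b = rng.standard_normal((n, d))
    cos = b[:, 0] / np.linalg.norm(b, axis=1)
    return np.mean(cos > 0.5)

for d in (2, 10, 20, 30, 40, 100):
    print(d, frac_cos_above_half(d))
```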
Related
If you take two random vectors in a high-dimensional space, the probability that they point in nearly the same direction is vanishingly small compared to the probability that they are nearly orthogonal.
As the number of dimensions (evaluation axes) increases, the probability that one person's skills completely dominate another's decreases.
100% in 1 dimension, 50% in 2 dimensions, 25% in 3 dimensions (halving with each added axis; see the sketch below)
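A minimal sketch (assuming the axes are independent and ties have probability zero): the chance that one of two random profiles wins on every axis is 2 × (1/2)^d = (1/2)^(d-1), which reproduces the 100%, 50%, 25% sequence above.

```python
import numpy as np

rng = np.random.default_rng(0)

def dominance_prob(d, n=1_000_000):
    a = rng.random((n, d))
    b = rng.random((n, d))
    # One profile dominates the other when it wins on every axis.
    dominates = np.all(a > b, axis=1) | np.all(a < b, axis=1)
    return np.mean(dominates)

for d in (1, 2, 3, 10):
    print(d, dominance_prob(d), "expected:", 0.5 ** (d - 1))
```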
https://gyazo.com/1b7ed946d22e1cceca40118b9cc7ee6f
Almost every stationary point is a [saddle point]
99.8% in 10 dimensions
It is rare for the curvature to have the same sign along every axis, which a local minimum or maximum would require.
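A minimal sketch under an assumption that is mine, not the page's: if the sign of the curvature along each axis at a stationary point is an independent fair coin flip, the fraction of saddle points is 1 − 2·(1/2)^d, which gives the 99.8% figure at d = 10.

```python
# A saddle point is any stationary point where the curvature signs
# are not all positive (a minimum) and not all negative (a maximum).
for d in (1, 2, 3, 10, 100):
    p_saddle = 1 - 2 * 0.5 ** d  # subtract "all positive" and "all negative"
    print(d, p_saddle)
```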
---
This page is auto-translated from /nishio/次元の呪い using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.